neural network layer
Hierarchical Lattice Layer for Partially Monotone Neural Networks
Partially monotone regression is a regression analysis in which the target values are monotonically increasing with respect to a subset of input features. The TensorFlow Lattice library is one of the standard machine learning libraries for partially monotone regression. It consists of several neural network layers, and its core component is the lattice layer. One of the problems of the lattice layer is that it requires the projected gradient descent algorithm with many constraints to train it. Another problem is that it cannot receive a high-dimensional input vector due to the memory consumption. We propose a novel neural network layer, the hierarchical lattice layer (HLL), as an extension of the lattice layer so that we can use a standard stochastic gradient descent algorithm to train HLL while satisfying monotonicity constraints and so that it can receive a high-dimensional input vector. Our experiments demonstrate that HLL did not sacrifice its prediction performance on real datasets compared with the lattice layer.
Hierarchical Lattice Layer for Partially Monotone Neural Networks
Partially monotone regression is a regression analysis in which the target values are monotonically increasing with respect to a subset of input features. The TensorFlow Lattice library is one of the standard machine learning libraries for partially monotone regression. It consists of several neural network layers, and its core component is the lattice layer. One of the problems of the lattice layer is that it requires the projected gradient descent algorithm with many constraints to train it. Another problem is that it cannot receive a high-dimensional input vector due to the memory consumption. We propose a novel neural network layer, the hierarchical lattice layer (HLL), as an extension of the lattice layer so that we can use a standard stochastic gradient descent algorithm to train HLL while satisfying monotonicity constraints and so that it can receive a high-dimensional input vector.
Reviews: Hyperbolic Neural Networks
Thanks to the authors for the detailed response. The new results presented in the rebuttal are indeed convincing, hence I am updating my score to an 8 now. This is with the understanding that these would be incorporated in the revised version of the paper. Several works in the last year have explored using hyperbolic representations for data which exhibits hierarchical latent structure. Some promising results on the efficiency of these representations at capturing hierarchical relationships have been shown, most notably by Nickel & Kiela (Nips, 2017). However one big hindrance for utilizing them so far is the lack of deep neural network models which can consume these representations as input for some other downstream task.
Learning from Demonstration with Implicit Nonlinear Dynamics Models
Fagan, Peter David, Ramamoorthy, Subramanian
Learning from Demonstration (LfD) is a useful paradigm for training policies that solve tasks involving complex motions, such as those encountered in robotic manipulation. In practice, the successful application of LfD requires overcoming error accumulation during policy execution, i.e. the problem of drift due to errors compounding over time and the consequent out-of-distribution behaviours. Existing works seek to address this problem through scaling data collection, correcting policy errors with a human-in-the-loop, temporally ensembling policy predictions or through learning a dynamical system model with convergence guarantees. In this work, we propose and validate an alternative approach to overcoming this issue. Inspired by reservoir computing, we develop a recurrent neural network layer that includes a fixed nonlinear dynamical system with tunable dynamical properties for modelling temporal dynamics. We validate the efficacy of our neural network layer on the task of reproducing human handwriting motions using the LASA Human Handwriting Dataset. Through empirical experiments we demonstrate that incorporating our layer into existing neural network architectures addresses the issue of compounding errors in LfD. Furthermore, we perform a comparative evaluation against existing approaches including a temporal ensemble of policy predictions and an Echo State Network (ESN) implementation. We find that our approach yields greater policy precision and robustness on the handwriting task while also generalising to multiple dynamics regimes and maintaining competitive latency scores.
- Europe > United Kingdom (0.14)
- Asia > China (0.14)
Hidden Classification Layers: Enhancing linear separability between classes in neural networks layers
Apicella, Andrea, Isgrò, Francesco, Prevete, Roberto
In the context of classification problems, Deep Learning (DL) approaches represent state of art. Many DL approaches are based on variations of standard multi-layer feed-forward neural networks. These are also referred to as deep networks. The basic idea is that each hidden neural layer accomplishes a data transformation which is expected to make the data representation "somewhat more linearly separable" than the previous one to obtain a final data representation which is as linearly separable as possible. However, determining the appropriate neural network parameters that can perform these transformations is a critical problem. In this paper, we investigate the impact on deep network classifier performances of a training approach favouring solutions where data representations at the hidden layers have a higher degree of linear separability between the classes with respect to standard methods. To this aim, we propose a neural network architecture which induces an error function involving the outputs of all the network layers. Although similar approaches have already been partially discussed in the past literature, here we propose a new architecture with a novel error function and an extensive experimental analysis. This experimental analysis was made in the context of image classification tasks considering four widely used datasets. The results show that our approach improves the accuracy on the test set in all the considered cases.
A New Computationally Simple Approach for Implementing Neural Networks with Output Hard Constraints
Konstantinov, Andrei V., Utkin, Lev V.
A new computationally simple method of imposing hard convex constraints on the neural network output values is proposed. The key idea behind the method is to map a vector of hidden parameters of the network to a point that is guaranteed to be inside the feasible set defined by a set of constraints. The mapping is implemented by the additional neural network layer with constraints for output. The proposed method is simply extended to the case when constraints are imposed not only on the output vectors, but also on joint constraints depending on inputs. The projection approach to imposing constraints on outputs can simply be implemented in the framework of the proposed method. It is shown how to incorporate different types of constraints into the proposed method, including linear and quadratic constraints, equality constraints, and dynamic constraints, constraints in the form of boundaries. An important feature of the method is its computational simplicity. Complexities of the forward pass of the proposed neural network layer by linear and quadratic constraints are O(n*m) and O(n^2*m), respectively, where n is the number of variables, m is the number of constraints. Numerical experiments illustrate the method by solving optimization and classification problems. The code implementing the method is publicly available.
- Asia > Russia (0.14)
- Europe > Russia > Northwestern Federal District > Leningrad Oblast > Saint Petersburg (0.04)
People detection and social distancing classification in smart cities for COVID-19 by using thermal images and deep learning algorithms
Elhanashi, Abdussalam, Saponara, Sergio, Gagliardi, Alessio
COVID-19 is a disease caused by severe respiratory syndrome coronavirus. It was identified in December 2019 in Wuhan, China. It has resulted in an ongoing pandemic that caused infected cases including some deaths. Coronavirus is primarily spread between people during close contact. Motivating to this notion, this research proposes an artificial intelligence system for social distancing classification of persons by using thermal images. By exploiting YOLOv2 (you look at once), a deep learning detection technique is developed for detecting and tracking people in indoor and outdoor scenarios. An algorithm is also implemented for measuring and classifying the distance between persons and automatically check if social distancing rules are respected or not. Hence, this work aims at minimizing the spread of the COVID-19 virus by evaluating if and how persons comply with social distancing rules. The proposed approach is applied to images acquired through thermal cameras, to establish a complete AI system for people tracking, social distancing classification, and body temperature monitoring. The training phase is done with two datasets captured from different thermal cameras. Ground Truth Labeler app is used for labeling the persons in the images. The achieved results show that the proposed method is suitable for the creation of a smart surveillance system in smart cities for people detection, social distancing classification, and body temperature analysis.
Neural network layers as parametric spans
Bergomi, Mattia G., Vertechi, Pietro
Properties such as composability and automatic differentiation made artificial neural networks a pervasive tool in applications. Tackling more challenging problems caused neural networks to progressively become more complex and thus difficult to define from a mathematical perspective. We present a general definition of linear layer arising from a categorical framework based on the notions of integration theory and parametric spans. This definition generalizes and encompasses classical layers (e.g., dense, convolutional), while guaranteeing existence and computability of the layer's derivatives for backpropagation.
Groovy: Detecting objects with Groovy, the Deep Java Library (DJL), and Apache MXNet
This blog posts looks at using Apache Groovy with the Deep Java Library (DJL) and backed by the Apache MXNet engine to detect objects within an image. Deep learning falls under the branches of machine learning and artificial intelligence. It involves multiple layers (hence the "deep") of an artificial neural network. There are lots of ways to configure such networks and the details are beyond the scope of this blog post, but we can give some basic details. We will have four input nodes corresponding to the measurements of our four characteristics.
Meta's 'data2vec' is a step toward One Neural Network to Rule Them All
The race is on to create one neural network that can process multiple kinds of data -- a more-general artificial intelligence that doesn't discriminate about types of data but instead can crunch them all within the same basic structure. The genre of multi-modality, as these neural networks are called, is seeing a flurry of activity in which different data, such as image, text, and speech audio, are passed through the same algorithm to produce a score on different tests such as image recognition, natural language understanding, or speech detection. And these ambidextrous networks are racking up scores on benchmark tests of AI. The latest achievement is what's called "data2vec," developed by researchers at the AI division of Meta (parent of Facebook, Instagram, and WhatsApp). The point, as Meta researcher Alexei Baevski, Wei-Ning Hsu, Qiantong Xu, Arun Babu, Jiatao Gu, and Michael Auli reveal in a blog post, is to approach something more like the general learning ability that the human mind seems to encompass.